Server system and operation method thereof

ABSTRACT

An operation method for a server system includes: (A) under control of a hardware abstraction layer (HAL), a plurality of node management units sharing a hardware resource; (B) if one of the node management units needs to use the hardware resource, the node management unit sending an instruction or data to the HAL, and the HAL accordingly using the hardware resource on behalf of the node management unit; and (C) if an external instruction is received, the HAL identifying which transmission port of the hardware resource received the external instruction so as to send the external instruction to a corresponding node management unit, and, after the external instruction is executed, the corresponding node management unit sending information back to the HAL so that the HAL sends the information back to an external system administrator.

This application claims the benefit of Taiwan application Serial No. 99124360, filed Jul. 23, 2010, the subject matter of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates in general to a server system and an operation method thereof.

2. Description of the Related Art

The blade server has been widely used in many fields of application. In general, several blade servers are assembled in a chassis system so as to provide operation convenience for the user. The blade server clusters together the core computing circuits of all server systems in a server station, so that the system administrator can maintain and control the clustered server systems and the network of the server station.

Currently, the server manages nodes according to the intelligent platform management interface (IPMI) protocol, and a baseboard management controller (BMC) is used for monitoring the node, recording events and recovering from system errors. A node refers to a computing unit with independent computing ability, and includes at least a central processing unit (CPU) and a memory. In the products currently available on the market, one single BMC can manage only one single node and cannot manage a plurality of nodes concurrently. The chassis system has a hardware chassis management module (CMM) for managing the entire chassis system.

Since the demand for data centers increases along with the development of cloud technology, how to accommodate more nodes within a limited space to increase the computing ability has become a pressing task for the IT industry.

Examples of the invention disclose a server system and an operation method thereof capable of reducing the number of BMC chips, thereby freeing internal space of the server so that more nodes can be disposed and the cost can be reduced.

SUMMARY OF THE INVENTION

Examples of the invention are directed to a server system and an operation method thereof. Through a hardware abstraction layer (HAL), a plurality of node management units (realized by software and respectively used for managing a node) of the BMC can share the hardware resource of the BMC.

According to an embodiment of the present invention, provided is a server system including at least a system board comprising a baseboard management controller and a plurality of nodes, wherein the baseboard management controller comprises a plurality of node management units, a hardware abstraction layer (HAL) and a hardware resource, and the node management units respectively manage the nodes and share the hardware resource under the control of the HAL; a connection port used for connecting to an external system administrator; and an internal channel connected to the system board and the connection port.

According to another embodiment of the present invention, provided is an operation method for a server system comprising at least a system board, the system board comprising a baseboard management controller and a plurality of nodes, the baseboard management controller comprising a plurality of node management units, an HAL and a hardware resource, the node management units respectively managing the nodes, the operation method comprising: (A) sharing the hardware resource by the node management units under the control of the HAL; (B) transmitting an instruction or a data to the HAL by the node management unit when one of the node management units needs to use the hardware resource, wherein the HAL uses the hardware resource on behalf of the node management unit; and (C) if an external instruction is received, the HAL identifying which transmission port of the hardware resource receives the external instruction and transmitting the external instruction to the corresponding node management unit and after the external instruction is executed, the corresponding node management unit transmitting an information to the HAL and the HAL further transmits the information to an external system administrator via the transmission port.

The above and other aspects of the invention will become better understood with regard to the following detailed description of the preferred but non-limiting embodiment(s). The following description is made with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a chassis system according to an embodiment of the invention;

FIG. 2 shows a BMC according to the embodiment of the invention;

FIG. 3 shows how a plurality of NMUs share the hardware portion of the BMC through an HAL according to the embodiment of the invention; and

FIG. 4A˜FIG. 4C show the transmission of instruction/information through the HAL according to the embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

In an embodiment of the invention, one single BMC can manage a plurality of nodes. By use of a hardware abstraction layer (HAL), the BMC is expanded from single-node management to multi-node management and still conforms to the IPMI protocol. Thus, the number of BMC chips used in the chassis system is reduced, which not only reduces the cost but also saves space and lowers the internal temperature of the chassis system.

FIG. 1 shows a chassis system according to the embodiment of the invention. As indicated in FIG. 1, the chassis system 100 at least includes a connection port 101, a local area network (LAN) 102, an inter-integrated circuit (I²C) bus 103, and a plurality of system boards. Although the chassis system 100 in FIG. 1 includes three system boards 110˜130, the embodiment of the invention is not limited thereto. The system board 110 includes a BMC 111 and nodes 112-1˜112-Y, the system board 120 includes a BMC 121 and nodes 122-1˜122-Y, and the system board 130 includes a BMC 131 and nodes 132-1˜132-Y, wherein Y is a positive integer.

The instruction and the signal from the system administrator are further transmitted to the corresponding system board through the connection port 101. Also, the information from the system board is transmitted to the system administrator through the connection port 101.

As indicated in FIG. 1, the LAN 102 and the I²C bus 103 provide a communication path between the BMCs of the system boards. In other embodiments of the invention, the BMC selectively has a chassis management module (CMM).

FIG. 2 shows the BMC according to the embodiment of the invention. As indicated in FIG. 2, the BMC includes a hardware portion and a software portion. The software portion includes an HAL 211 and node management units (NMU) 212-1˜212-Y, and the hardware portion includes a general purpose input/output (GPIO) pin 221, a storage unit 222, a serial port 223, a sensing unit 224, a system interface (SI) 225, a LAN interface 226 and an I²C interface 227.

For each node, the BMC accesses the reading of the sensing unit 224 so as to monitor physical parameters (such as CPU temperature, memory temperature, and voltage) of the node. For example, the BMC may have three CPU temperature sensors for sensing the CPU temperatures of three nodes respectively. Moreover, the BMC controls the ON/OFF state of the system through the GPIO pin 221. In addition, the system administrator may transmit an IPMI instruction to the BMC through the LAN interface 226 or the system interface 225, for requesting the BMC to execute the IPMI instruction transmitted thereto.

The NMU is a management software unit conforming to the IPMI protocol. For example, in the BMC 111, the NMU 1˜NMU 3 respectively manage the nodes 112-1˜112-3. Since one single BMC manages a plurality of nodes, the plurality of NMUs must share the hardware portion of the BMC. The hardware abstraction layer (HAL) 211 is used to resolve this issue. For each NMU, the HAL 211 establishes a respective logic (virtual) hardware device mapped to physical hardware device(s).

FIG. 3 shows how a plurality of NMUs share the hardware portion of the BMC through the HAL according to the embodiment of the invention. As indicated in FIG. 3, when the NMU needs to access a sensor data record (SDR), the NMU does not need to know the physical access address of the SDR of the node in the storage unit 222. When the NMU needs to read the SDR data, the NMU informs the HAL 211 which SDR data (such as the CPU temperature, the memory temperature, and the applied voltage) of the node the NMU needs, and the HAL 211 transmits the SDR data of the corresponding node to the NMU. SDR 1˜SDR 3 respectively denote the SDR data of the nodes 1˜3, which respectively correspond to the NMU 1˜NMU 3.

Similarly, when the NMU needs to store the SDR data, the NMU does not need to know the physical storage address of the SDR of the node in the storage unit 222. When the NMU needs to store the SDR data, the NMU transmits the to-be-stored SDR data to the HAL 211, which accordingly stores the SDR data to the storage unit 222. That is, the HAL 211 maps data to be accessed or stored by the NMU to the storage unit 222.
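
The read and store mapping described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the class name `StorageHal`, the fixed-size per-node region and the sample SDR payloads are all assumptions introduced only for illustration. The sketch shows an NMU identifying its data by node number alone, while the HAL is the only component that knows where that data sits in the shared storage unit.

```python
from dataclasses import dataclass, field

REGION_SIZE = 16  # hypothetical: bytes reserved per node for SDR data


@dataclass
class StorageHal:
    """Hypothetical HAL front end for the shared storage unit."""
    node_count: int
    storage: bytearray = field(init=False)

    def __post_init__(self) -> None:
        # One shared physical storage unit, partitioned into per-node regions.
        self.storage = bytearray(REGION_SIZE * self.node_count)

    def _base(self, node: int) -> int:
        # Only the HAL knows the physical address of a node's region.
        if not 1 <= node <= self.node_count:
            raise ValueError(f"unknown node {node}")
        return (node - 1) * REGION_SIZE

    def store_sdr(self, node: int, data: bytes) -> None:
        if len(data) > REGION_SIZE:
            raise ValueError("SDR data larger than the node's region")
        base = self._base(node)
        # Zero-pad so bytes from an earlier, longer record are cleared.
        self.storage[base:base + REGION_SIZE] = data.ljust(REGION_SIZE, b"\x00")

    def read_sdr(self, node: int) -> bytes:
        base = self._base(node)
        # Simplification: trailing zero bytes are treated as padding.
        return bytes(self.storage[base:base + REGION_SIZE]).rstrip(b"\x00")


storage_hal = StorageHal(node_count=3)
storage_hal.store_sdr(1, b"cpu=45C")  # NMU 1 stores without knowing an address
storage_hal.store_sdr(2, b"cpu=51C")  # NMU 2 lands in a different region
```

In this sketch, a real record format and any locking between NMUs are deliberately left out; only the address translation performed by the HAL is modeled.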

A system event log (SEL) is used for storing the events (such as system abnormality) of the node. Similarly, when the NMU 1˜NMU 3 need to access SEL 1˜SEL 3, the HAL 211 accesses the storage unit 222 in the manner disclosed above. A field replaceable unit (FRU) is used for recording system information such as the number of the system board and the product name. Similarly, when the NMU 1˜NMU 3 need to access the FRU 1˜FRU 3, the HAL 211 accesses the storage unit 222 in the manner disclosed above. Furthermore, the data mapped by the HAL 211 is not limited to SDR, SEL and FRU. Other functions in the IPMI protocol, such as serial over LAN (SOL), platform event filter (PEF), sensor monitoring and chassis control, can also be mapped or transmitted by the NMU through the HAL.

FIG. 4A˜FIG. 4C show the transmission of instruction/information through the HAL according to the embodiment of the invention. As indicated in FIG. 4A, the communication between the system administrator 410 and the HAL 211 is bi-directional, and so is the communication between the HAL 211 and the NMU.

FIG. 4B shows the system administrator 410 transmitting an IPMI instruction to the BMC through the HAL 211. As indicated in FIG. 4B, the system administrator 410 transmits an IPMI instruction to the HAL 211. Then, the HAL 211 judges whether the IPMI instruction is transmitted through a system interface (SI) (as indicated in step 421) or through a LAN interface (as indicated in step 422). If the IPMI instruction is transmitted through the SI, then the HAL 211 judges whether the IPMI instruction is transmitted through the first transmission port SI 1 (which corresponds to the node 1) of the SI, the second transmission port SI 2 (which corresponds to the node 2) of the SI or the third transmission port SI 3 (which corresponds to the node 3) of the SI, as indicated in steps 431˜433. In the present embodiment of the invention, the system interface of the BMC has a plurality of SI transmission ports, and three SI transmission ports are used for connecting the BMC to the system administrator 410. If the IPMI instruction is transmitted through the LAN interface, then the HAL 211 judges whether the IPMI instruction is transmitted through the first transmission port LAN 1 (which corresponds to the node 1) of the LAN interface, the second transmission port LAN 2 (which corresponds to the node 2) of the LAN interface or the third transmission port LAN 3 (which corresponds to the node 3) of the LAN interface, as indicated in steps 434˜436. In the present embodiment of the invention, the LAN interface of the BMC has a plurality of LAN transmission ports, and three LAN transmission ports are used for connecting the BMC to the system administrator 410. After the judgment steps 431˜436, the HAL 211 determines to which of the NMU 1˜NMU 3 the IPMI instruction from the system administrator 410 should be transmitted, and the HAL 211 accordingly transmits the IPMI instruction to the target NMU.
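
The two-stage judgment of FIG. 4B (interface first, then transmission port) can be sketched as a table lookup. The following Python fragment is only an illustrative sketch: the table `PORT_TO_NMU` and the function `route_instruction` are hypothetical names, and the one-to-one assignment of port k to NMU k simply mirrors the three-node example in the text.

```python
from typing import Dict, Tuple

# Hypothetical table: (interface, transmission port number) -> NMU number.
# It mirrors the example in the text: SI 1 / LAN 1 serve node 1, and so on.
PORT_TO_NMU: Dict[Tuple[str, int], int] = {
    ("SI", 1): 1, ("SI", 2): 2, ("SI", 3): 3,
    ("LAN", 1): 1, ("LAN", 2): 2, ("LAN", 3): 3,
}


def route_instruction(interface: str, port: int) -> int:
    """Return the number of the NMU that should receive the instruction."""
    # Steps 421/422: decide which interface carried the instruction.
    if interface not in ("SI", "LAN"):
        raise ValueError(f"unsupported interface {interface!r}")
    # Steps 431~436: decide which transmission port of that interface.
    try:
        return PORT_TO_NMU[(interface, port)]
    except KeyError:
        raise ValueError(f"no NMU bound to {interface} port {port}") from None
```

For instance, `route_instruction("SI", 2)` selects NMU 2, the unit managing node 2. A table keeps the port-to-node binding in one place, so adding a node in this sketch would only add table entries.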

FIG. 4C shows the BMC transmitting information to the system administrator 410 through the HAL 211. After the NMU receives the IPMI instruction from the system administrator 410, the NMU performs the corresponding operation, and then transmits the response information back to the system administrator 410 through the HAL 211. As indicated in FIG. 4C, the NMU transmits the response information to the HAL 211. Next, the HAL 211 judges whether the response information is transmitted through the system interface (SI) (as indicated in step 441) or the LAN interface (as indicated in step 442). In either case, the HAL 211 analyzes the received response information and identifies which NMU issued the response information (steps 451˜453 for the system interface and steps 454˜456 for the LAN interface). In the present embodiment of the invention, the system interface of the BMC has a plurality of SI transmission ports, and three SI transmission ports are used for connecting the system administrator 410 to the BMC. The LAN interface of the BMC has a plurality of LAN transmission ports, and three LAN transmission ports are used for connecting the system administrator 410 to the BMC. If the response information is transmitted through the system interface, the HAL 211 identifies which NMU issued the response information (steps 451˜453), and can thus transmit the response information back to the system administrator 410 through the interface (the SI) through which the IPMI instruction was originally received (steps 461˜463). Similarly, if the NMU transmits the response information through the LAN interface, the HAL 211 identifies which NMU issued the response information (steps 454˜456), and the response information is transmitted back to the system administrator 410 through the interface (the LAN interface) through which the IPMI instruction was originally received (steps 464˜466).

In the embodiment of the invention, when the system administrator 410 sends the IPMI instruction to the BMC through the LAN interface or the system interface, the HAL 211 identifies which transmission port receives the IPMI instruction, and transmits the instruction to a corresponding NMU. After the instruction is executed by the NMU, the NMU transmits the response information back to the HAL 211, which accordingly transmits the response information back to the system administrator 410 through the original transmission port. However, the embodiment of the invention is not limited to the above example, in which the HAL 211 transmits the IPMI instruction through the LAN interface or the system interface. In other embodiments of the invention, the HAL 211 can also transmit the IPMI instruction through other interfaces supported by the IPMI protocol.
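
The round trip summarized above (instruction in through a port, response out through the same port) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the embodiment's firmware: `RoutingHal`, `make_nmu` and the instruction strings are hypothetical, the NMUs are modeled as plain functions, and port k of either interface is assumed to serve node k as in the three-node example.

```python
from typing import Callable, Dict, List, Tuple

Port = Tuple[str, int]  # (interface name, transmission port number)


class RoutingHal:
    """Hypothetical HAL that forwards instructions and returns responses."""

    def __init__(self, nmus: Dict[int, Callable[[str], str]]) -> None:
        self.nmus = nmus  # NMU number -> function standing in for an NMU
        # (port, response) pairs sent back to the system administrator.
        self.sent: List[Tuple[Port, str]] = []

    def on_instruction(self, port: Port, instruction: str) -> None:
        interface, number = port
        if interface not in ("SI", "LAN") or number not in self.nmus:
            raise ValueError(f"no NMU bound to port {port}")
        # Assumption: transmission port k of either interface serves node k.
        response = self.nmus[number](instruction)
        # The response leaves through the port that received the instruction.
        self.sent.append((port, response))


def make_nmu(node: int) -> Callable[[str], str]:
    """Build a stand-in NMU that just reports which node executed what."""
    def handle(instruction: str) -> str:
        return f"node {node}: {instruction} done"
    return handle


routing_hal = RoutingHal({n: make_nmu(n) for n in (1, 2, 3)})
# The instruction names below are placeholders, not real IPMI commands.
routing_hal.on_instruction(("LAN", 2), "get-sensor-reading")
routing_hal.on_instruction(("SI", 3), "chassis-status")
```

Because the HAL holds the receiving port for the duration of the call, the stand-in NMUs never need to know which interface the administrator used.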

To summarize, the embodiment of the invention has at least the following advantages. (1) The number of BMC chips in a high-density server (such as a blade server) is reduced, so that the cost is reduced accordingly. (2) Space is utilized more effectively, the number of nodes and computing ability of the server are higher, and system temperature is lowered (due to the decrease in the number of BMC chips).

While the invention has been described by way of example and in terms of the preferred embodiment(s), it is to be understood that the invention is not limited thereto. On the contrary, it is intended to cover various modifications and similar arrangements and procedures, and the scope of the appended claims therefore should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements and procedures.

What is claimed is:
 1. A server system, comprising: at least a system board comprising a baseboard management controller and a plurality of nodes, wherein the baseboard management controller comprises a plurality of node management units, a hardware abstraction layer (HAL) and a hardware resource, and the node management units respectively manage the nodes and share the hardware resource under the control of the HAL; a connection port used for connecting to an external system administrator; and an internal channel connected to the system board and the connection port; the at least a system board further comprising a plurality of transmission ports, the baseboard management controller being connected to the external system administrator via the transmission ports, wherein after an external instruction is transmitted to the baseboard management controller through the hardware resource, then the HAL identifies which transmission port receives the external instruction and transmits the external instruction to the corresponding node management unit, and after the corresponding node management unit executes the external instruction, the corresponding node management unit transmits an information to the HAL for transmitting the information back to the external system administrator via the transmission port.
 2. The server system according to claim 1, wherein for each node management unit, the HAL establishes a logic hardware device mapped to the hardware resource.
 3. The server system according to claim 2, wherein if one of the node management units needs to use the hardware resource, the node management unit transmits an instruction to the HAL and the HAL accesses the hardware resource according to the instruction and transmits a result to the node management unit.
 4. The server system according to claim 2, wherein when one of the node management units needs to use the hardware resource, the node management unit transmits a data to the HAL and the HAL accesses the hardware resource according to the data.
 5. An operation method for a server system comprising at least a system board, the system board comprising a baseboard management controller and a plurality of nodes, the baseboard management controller comprising a plurality of node management units, a hardware abstraction layer (HAL) and a hardware resource, the node management units respectively managing the nodes, the operation method comprising: (A) sharing the hardware resource by the node management units under the control of the HAL; (B) transmitting an instruction to the HAL by the node management unit when one of the node management units needs to use the hardware resource, wherein the HAL uses the hardware resource on behalf of the node management unit, accesses the hardware resource according to the instruction and transmits a result to the node management unit; and (C) after an external instruction is received, the HAL identifies which transmission port of the hardware resource receives the external instruction and transmits the external instruction to the corresponding node management unit and, after the external instruction is executed, the corresponding node management unit transmits an information to the HAL and the HAL further transmits the information to an external system administrator via the transmission port.
 6. The operation method according to claim 5, wherein the step (A) comprises: for each node management unit, establishing a logic hardware device mapped to the hardware resource by the HAL.
 7. The server system according to claim 2, wherein when one of the node management units needs to use the hardware resource, the node management unit transmits a data to the HAL and the HAL accesses the hardware resource according to the data.
 8. The server system according to claim 1, wherein the HAL identifies which one of the plurality of transmission ports received the external instruction.
 9. The server system according to claim 1, wherein the external instruction is transmitted to the baseboard management controller through the hardware resource.
 10. The server system according to claim 1, wherein the server system comprises at least three system boards.
 11. The operation method according to claim 6, wherein when one of the node management units needs to use the hardware resource, the node management unit transmits an instruction to the HAL and the HAL accesses the hardware resource according to the instruction and transmits a result to the node management unit.
 12. The operation method according to claim 6, wherein when one of the node management units needs to use the hardware resource, the node management unit transmits a data to the HAL and the HAL accesses the hardware resource according to the data.
 13. The operation method according to claim 5, wherein the information comprises response information regarding execution of the external instruction.
 14. The operation method according to claim 5, wherein the server system comprises at least three system boards.